Abstract. We present models and soundness results for hybrid information flow, 
i.e. for mechanisms that enforce noninterference-style security guarantees using 
a combination of static analysis and dynamic taint tracking. Our analysis has the 
following characteristics: (i) we formulate hybrid information flow as an end- 
to-end property, in contrast to disruptive monitors that prematurely terminate or 
otherwise alter an execution upon detecting a potentially illicit flow; (ii) our secu- 
rity notions capture the increased precision that is gained when static analysis is 
combined with dynamic enforcement; (iii) we introduce path tracking to incorpo- 
rate a form of termination-sensitivity, and (iv) develop a novel variant of purely 
dynamic tracking that ignores indirect flows; (v) our work has been formally ver- 
ified, by a comprehensive representation in the theorem prover Coq. 

1 Introduction 

Hybrid information flow techniques integrate static analyses with dynamic taint tracking 
or execution monitoring to ensure the absence of illicit flows. In the systems community, 
instrumentations that control direct data flows in assignments have been refined, using 
compile-time transformations or dynamic binary instrumentation, to capture indirect (or 
implicit) flows arising from the branching behavior or aliasing [29, 23, 14, 19, 18]. Typically,
these systems focus on efficient implementability and are justified by informal
arguments, but lack mathematically precise definitions of the intended security guaran- 
tee or formal soundness proofs. 

The inverse observation applies to work on language-based security. Here, the tradi- 
tional focus on static techniques has recently been complemented by studies of coupled 
or inlined monitors, with different mechanisms for detecting, preventing, or handling 
potentially illicit flows [30, 28, 27, 13, 24, 20, 3, 4, 22]. Typically, these analyses are
illustrated with proof-of-concept implementations that are less comprehensive than those
of the systems community but are backed up with (pencil-and-paper) soundness proofs 
based on precise definitions of operational models and the intended security guarantee. 

Our long-term goal is to integrate flexible information flow enforcement into frameworks
such as the Verified Software Toolchain [2]. As a stepping stone, and building on
formalizations of type systems for noninterference [7, 10, 11], we present the first hybrid
enforcement that is backed up by an implementation in a proof assistant, Coq [9].

* This work was funded in part by the Air Force Office of Scientific Research
(FA9550-09-1-0138).

Our analysis takes inspiration from Rifle [29], one of the first hybrid systems with
a precise treatment of indirect flows. In particular, we provide an analysis of one of 
Rifle's core ideas, the use of separate taint registers for direct and indirect flows. Un- 
derpinning Rifle's intuitive soundness argument with a formal guarantee, we show in 
which sense the tracking of indirect flows using dedicated join instructions improves 
precision when compared to a more naive system where only data taints are tracked 
dynamically. The latter system in turn is more precise than the standard static system, 
the flow-sensitive type system of Hunt and Sands [16, 17].

The guarantees recently proposed for inlined [13] or non-inlined [27] monitors rely on
prematurely terminating or altering executions upon detecting a potentially illicit flow. Our
analysis shows that dynamic enforcement may alternatively be understood as an asymmetric
end-to-end indistinguishability notion that refines (multilevel) noninterference. Here, 
the asymmetry captures the difference between the actual (taint-instrumented) execu- 
tion and hypothetical competitor executions (which may be taint-instrumented or not). 
The resulting notion captures the systems-oriented intuition that each value's final taint 
should carry fine-grained information regarding its (potential) origins. 

Asymmetric interpretations have previously been considered by Le Guernic and
Jensen [21, 20], and also by Magazinius et al. [22]. These works limit their attention
to two-level security and track direct and indirect flows jointly. In addition, Le Guer- 
nic and Jensen specify the set of observable (final) variables statically. Our analysis 
exposes additional fine-grained structure of taint tracking and highlights that the nonin- 
fluence between taints and the native execution actually refines to a ternary discipline: 
data taints (capturing direct flows) are unaffected by control taints (capturing indirect 
flows), and neither one affects the data plane (i.e. the native execution). 

Finally, we introduce path tracking, by adding a further taint register that collects the 
taints of all branch conditions encountered. Path tracking propagates termination from 
tracked to untracked executions, and also provides a meaningful indistinguishability 
guarantee for pure data flow tracking. In particular, we outline in which sense control 
taints can be safely eliminated without detrimentally affecting the security guarantee. 

Summarizing our contributions, we present an analysis of hybrid information flow 
enforcement that considers fine-grained taint-tracking with separated control and data 
taints. We develop appropriate notions of formal security, prove corresponding sound- 
ness results, and investigate the relative precision of different syntheses. We introduce 
path tracking as an extension of previous instrumentations, obtaining termination-aware 
tracking and an (to our knowledge: the first) extensional interpretation of data flow 
tracking. In contrast to all previous developments of hybrid information flow or dynamic
flow tracking, our development is backed up by a formalization in Coq; see [9].

In contrast to Rifle's assembly-level formulation, our analysis is carried out for the 
language of structured commands and loops. This setting suffices for studying the in- 
tended security notion and the core aspects of join-based taint instrumentation, but a 
study of additional aspects would certainly be profitable. Of particular interest in this 
regard would be Rifle's treatment of memory, which combines a reaching-definitions 
analysis with external aliasing analyses. However, the exploitation of aliasing information
in Rifle's synthesis is such that use-sites of taint registers are not necessarily
dominated by their def-sites, prohibiting a proof of soundness along the program
structure. For this reason, our synthesis uses a different allocation policy for taint
registers. A future integration of memory may allow us to study this aspect in more detail
and may take additional inspiration from the recent work of Moore and Chong [24].

2 Static Flow-Sensitive Information Flow

We start by fixing some notation and revising aspects of Hunt & Sands' static analysis. 

2.1 Native Language 

For disjoint sets $\mathcal{X}$ of program variables and $\mathcal{V}$ of value constants, our language
concerns expressions $\mathcal{E}$ and commands $\mathcal{C}$ according to the grammar

$e \in \mathcal{E} ::= x \mid v \mid e \oplus e$

$C \in \mathcal{C} ::= \mathsf{skip} \mid x := e \mid C; C \mid \mathsf{if}\ e\ \mathsf{then}\ C\ \mathsf{else}\ C \mid \mathsf{while}\ e\ \mathsf{do}\ C$

where $x, y, \ldots \in \mathcal{X}$, $v, w, \ldots \in \mathcal{V}$ and $\oplus$ ranges over binary operations. The set of
program variables possibly modified (i.e. assigned to) in some command $C$ is denoted
by $MV(C)$. Expression evaluation $s \vdash e \Downarrow v$ and (big-step) command execution
$s \xrightarrow{C} t$ are formulated over stores $s, t, \ldots$ from the space $\mathcal{S} = \mathcal{X} \to \mathcal{V}$. Throughout the
paper, update operations of various total or partial functions are written as $.[. \mapsto .]$, and
lookups as $.(.)$ or simply juxtaposition. We denote empty lists (over various types) by $\epsilon$,
list append by @, and list prefixing by ::. The definitions of these auxiliary notions and
of the operational judgements are entirely standard and hence omitted.

We formulate our analysis generically over a semilattice $\mathcal{L}$ with partial order
$\sqsubseteq : \mathcal{L} \to \mathcal{L} \to \mathsf{Prop}$, least upper bound $\sqcup : \mathcal{L} \to \mathcal{L} \to \mathcal{L}$, and bottom element $\bot$.
We typically write the binary operators in infix position and extend $\sqcup$ to subsets of
$\mathcal{L}$, with $\sqcup \emptyset = \bot$, and also to $\mathcal{L}^*$ (i.e. finite lists over $\mathcal{L}$), with $\sqcup \epsilon = \bot$. Example
programs use the ternary semilattice $low \sqsubset mid \sqsubset high$ and silently name variables
according to their typical initial taint: $l : low$, $m : mid$, $h : high$. Of course, the static
level and the dynamic taint of a variable may differ from this at different program points.
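As a minimal sketch of the lattice interface just described, the three-point chain can be encoded by integer ranks. The names `leq`, `lub`, `lub_all`, and the rank encoding are assumptions of this illustration, not part of the paper's Coq development.

```python
from functools import reduce

# Three-point chain low ⊑ mid ⊑ high, encoded by rank (an assumption of this sketch).
LOW, MID, HIGH = 0, 1, 2
BOTTOM = LOW


def leq(a, b):
    """Partial order a ⊑ b on the chain."""
    return a <= b


def lub(a, b):
    """Binary least upper bound a ⊔ b."""
    return max(a, b)


def lub_all(levels):
    """Lub of a finite list or set of levels; the empty case yields bottom."""
    return reduce(lub, levels, BOTTOM)
```

Folding from `BOTTOM` is what makes the lub of the empty set and of the empty list coincide with the bottom element, as stipulated in the text.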

2.2 Flow-Sensitive Type System in the Style of Hunt and Sands 

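The starting point of our analysis is the flow-sensitive type system for multilevel
noninterference by Hunt & Sands [16, 17]. Its algorithmic formulation (given in [17])
employs judgements $\vdash \{\Gamma\} C \{\Delta\}$ where the contexts $\Gamma, \Delta$ associate security elements
to all program variables and also to a distinguished pseudo-variable $pc \notin \mathcal{X}$. We
denote the restriction of context $\Gamma$ to the program variables by $|\Gamma|$ and let $G, H, \ldots$ range
over such pc-erased contexts. Lattice constants and operations are lifted to erased or
unerased contexts in the standard pointwise fashion. Label evaluation $[\![e]\!]_G$ is given by

$[\![x]\!]_G = G(x) \qquad [\![v]\!]_G = \bot \qquad [\![e_1 \oplus e_2]\!]_G = [\![e_1]\!]_G \sqcup [\![e_2]\!]_G$

and is extended to non-erased contexts via $[\![e]\!]_\Gamma = [\![e]\!]_{|\Gamma|}$. The proof rules for
$\vdash \{\Gamma\} C \{\Delta\}$ are summarized in Figure 1, with an explicit construction of the (least) fixed
point context in rule HS-While. We write $\mathcal{D} \vdash \{\Gamma\} C \{\Delta\}$ to refer to a particular
derivation for $\vdash \{\Gamma\} C \{\Delta\}$. It is easy to see that $\vdash \{\Gamma\} C \{\Delta\}$ implies

$$\text{HS-Skip}\quad \dfrac{}{\vdash \{\Gamma\}\ \mathsf{skip}\ \{\Gamma\}}
\qquad
\text{HS-Ass}\quad \dfrac{}{\vdash \{\Gamma\}\ x := e\ \{\Gamma[x \mapsto \Gamma(pc) \sqcup [\![e]\!]_\Gamma]\}}$$

$$\text{HS-Comp}\quad \dfrac{\vdash \{\Gamma_1\} C_1 \{\Gamma_2\} \qquad \vdash \{\Gamma_2\} C_2 \{\Gamma_3\}}{\vdash \{\Gamma_1\}\ C_1; C_2\ \{\Gamma_3\}}$$

$$\text{HS-Ite}\quad \dfrac{p = \Gamma(pc) \qquad \Gamma' = \Gamma[pc \mapsto p \sqcup [\![e]\!]_\Gamma] \qquad \forall i \in \{1, 2\}.\ \vdash \{\Gamma'\} C_i \{\Delta_i\}}{\vdash \{\Gamma\}\ \mathsf{if}\ e\ \mathsf{then}\ C_1\ \mathsf{else}\ C_2\ \{(\Delta_1 \sqcup \Delta_2)[pc \mapsto p]\}}$$

$$\text{HS-While}\quad \dfrac{\begin{array}{c}
\forall i < n.\ \Gamma_i \neq \Gamma_{i+1} \qquad \forall i \geq n.\ \Gamma_i = \Gamma_{i+1} \qquad p = \Gamma(pc) \\
\Gamma_0 = \Gamma[pc \mapsto p \sqcup [\![e]\!]_\Gamma] \qquad \forall i.\ \vdash \{\Gamma_i\} C \{\Delta_i\} \\
\forall i.\ \Gamma_{i+1} = (\Delta_i \sqcup \Gamma)[pc \mapsto (\Delta_i \sqcup \Gamma)(pc) \sqcup [\![e]\!]_{\Delta_i \sqcup \Gamma}]
\end{array}}{\vdash \{\Gamma\}\ \mathsf{while}\ e\ \mathsf{do}\ C\ \{\Gamma_n[pc \mapsto p]\}}$$

Fig. 1. System $\vdash \{\Gamma\} C \{\Delta\}$ by Hunt & Sands with explicit (least) fixed point in rule HS-While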



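- $\Gamma(pc) = \Delta(pc)$

- $\Delta = \Delta'$ whenever $\vdash \{\Gamma\} C \{\Delta'\}$ (functionality of H. & S.-typing)

- $\Delta' \sqsubseteq \Delta$ whenever $\vdash \{\Gamma'\} C \{\Delta'\}$ and $\Gamma' \sqsubseteq \Gamma$ (monotonicity of H. & S.-typing)

- $\Delta(pc) \sqsubseteq \Delta x$ for $x \in MV(C)$, and $\Delta x = \Gamma x$ for $x \notin MV(C)$.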
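The rules of Figure 1 can be read as an algorithm that computes the post-context from the pre-context. The following is a minimal sketch under stated assumptions: commands and expressions are encoded as Python tuples (a hypothetical encoding chosen for this illustration), the lattice is the three-point chain with lub given by `max`, and every context lists all program variables plus the key `"pc"`.

```python
LOW, MID, HIGH = 0, 1, 2  # chain lattice, lub = max (assumption of this sketch)


def label(e, ctx):
    """Label evaluation: variables look up ctx, constants are bottom."""
    if isinstance(e, str):             # variable
        return ctx[e]
    if isinstance(e, tuple):           # ("op", e1, e2)
        return max(label(e[1], ctx), label(e[2], ctx))
    return LOW                         # value constant


def join_ctx(c1, c2):
    """Pointwise lub of two contexts over the same keys."""
    return {k: max(c1[k], c2[k]) for k in c1}


def hs(ctx, cmd):
    """Return the post-context Delta such that |- {ctx} cmd {Delta}."""
    tag = cmd[0]
    if tag == "skip":
        return dict(ctx)
    if tag == "assign":                # ("assign", x, e), rule HS-Ass
        _, x, e = cmd
        return {**ctx, x: max(ctx["pc"], label(e, ctx))}
    if tag == "seq":                   # ("seq", c1, c2), rule HS-Comp
        return hs(hs(ctx, cmd[1]), cmd[2])
    if tag == "if":                    # ("if", e, c1, c2), rule HS-Ite
        _, e, c1, c2 = cmd
        p = ctx["pc"]
        inner = {**ctx, "pc": max(p, label(e, ctx))}
        return {**join_ctx(hs(inner, c1), hs(inner, c2)), "pc": p}
    if tag == "while":                 # ("while", e, body), rule HS-While
        _, e, body = cmd
        p = ctx["pc"]
        cur = {**ctx, "pc": max(p, label(e, ctx))}         # Gamma_0
        while True:
            merged = join_ctx(hs(cur, body), ctx)          # Delta_i lub Gamma
            nxt = {**merged, "pc": max(merged["pc"], label(e, merged))}
            if nxt == cur:                                 # chain has stabilized
                return {**cur, "pc": p}
            cur = nxt
    raise ValueError(tag)
```

The loop case iterates the chain of contexts until two consecutive elements agree, which happens after finitely many steps on a finite lattice because the chain is increasing.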

Furthermore, we have the following: 

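Lemma 1. For $\vdash \{\Gamma\} C \{\Delta\}$ and $s \xrightarrow{C} t$, any $x \in \mathcal{X}$ satisfies $\Delta(pc) \sqsubseteq \Delta x$ or
$sx = tx \wedge \Gamma x \sqsubseteq \Delta x$.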

Following common practice, our security notions are formulated using indistinguishability
relations over stores, given some threshold $\kappa \in \mathcal{L}$:

Definition 1. States $s$ and $s'$ are $G$-indistinguishable below $\kappa \in \mathcal{L}$, notation $s =^{G}_{\kappa} s'$,
if all $x$ with $Gx \sqsubseteq \kappa$ satisfy $sx = s'x$. Command $C$ is $\kappa$-secure for $G$ and $H$, notation
$\models_{\kappa} \{G\} C \{H\}$, if $s =^{G}_{\kappa} s'$ implies $t =^{H}_{\kappa} t'$ whenever $s \xrightarrow{C} t$ and $s' \xrightarrow{C} t'$.

We note that for $\kappa' \sqsubseteq \kappa$ and $G \sqsubseteq G'$, $s =^{G}_{\kappa} s'$ implies $s =^{G'}_{\kappa'} s'$ (monotonicity of
indistinguishability). Soundness of $\vdash \{\Gamma\} C \{\Delta\}$ is then given by the following result.

Theorem 1. If $\vdash \{\Gamma\} C \{\Delta\}$ then $\models_{\kappa} \{|\Gamma|\} C \{|\Delta|\}$ for any $\kappa$.
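Definition 1 can be checked by brute force on small, terminating examples. The sketch below assumes stores and contexts are Python dictionaries, levels are integer ranks, and a program is modelled as a hypothetical function `run` from initial to final store; it illustrates the definition and is not the verification technique used in the paper.

```python
from itertools import product

LOW, MID, HIGH = 0, 1, 2  # chain lattice encoded by rank (assumption of this sketch)


def indistinguishable(s1, s2, G, kappa):
    """s1 and s2 agree on every variable whose level in G is below kappa."""
    return all(s1[x] == s2[x] for x in G if G[x] <= kappa)


def kappa_secure(run, stores, G, H, kappa):
    """Check the clause of Definition 1 over a finite set of initial stores."""
    for s1, s2 in product(stores, repeat=2):
        if indistinguishable(s1, s2, G, kappa):
            if not indistinguishable(run(s1), run(s2), H, kappa):
                return False
    return True


def copy_h_to_l(s):
    """Hypothetical stand-in for the native program l := h."""
    return {**s, "l": s["h"]}
```

For example, with `G = {"l": LOW, "h": HIGH}` the program `l := h` is not `LOW`-secure for `G` and `G`, but it is `LOW`-secure once the final context raises `l` to `HIGH`, which is what rule HS-Ass computes.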



3 Data Flow Tracking 

The first step towards a more dynamic regime is a language where only direct flows 
(from e into x in assignments x:=e) are tracked. Indirect flows are treated by program 
annotations based on the static analysis, so that the taints of all affected assignments are 
incremented by the branch conditions' static security levels. While being less permissive
than the system we will develop in Section 4, this language suffices for motivating
our formulation of soundness and discussing some aspects of the program synthesis.

3.1 Language with Decorated Assignments

The category of taint-instrumented commands $\mathcal{T}$ (typically ranged over by $T$) agrees
with that for native commands $\mathcal{C}$ except that assignments take the form $[A]\ x := e$ where
$A \in \mathcal{L}^*$. We denote the native command arising from recursively erasing all decorations
from assignments by $|T|$.

Operationally, the taint-extended language manipulates states $\sigma, \tau, \ldots$ that are pairs
of stores $s$ and (erased) contexts$^1$ $G$. Command evaluation $\sigma \xrightarrow{T}_{\mathcal{T}} \tau$ is defined in parallel
to the native $s \xrightarrow{C} t$, with the exception that an instrumented assignment $[A]\ x := e$
additionally updates the taint for $x$ to the lub of $A$ and the taint of $e$ (given by the lub of
the taints associated with the variables in $e$):

$$\text{T-Ass}\quad \dfrac{s \vdash e \Downarrow v \qquad [\![e]\!]_G = a \qquad \sqcup A = l}{(s, G) \xrightarrow{[A]\ x := e}_{\mathcal{T}} (s[x \mapsto v],\ G[x \mapsto l \sqcup a])}$$

As a consequence, the language satisfies the following simple erasure property: for any
$G$, $s \xrightarrow{|T|} t$ is equivalent to $\exists H.\ (s, G) \xrightarrow{T}_{\mathcal{T}} (t, H)$.
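The operational reading of decorated assignments can be sketched as a small big-step interpreter over pairs of a store and a taint map. The tuple encoding of commands, the `OPS` table, and the integer lattice are assumptions made for this illustration only.

```python
import operator

LOW, MID, HIGH = 0, 1, 2   # chain lattice; lub is max (assumption of this sketch)
OPS = {"+": operator.add, "==": operator.eq}


def ev(e, s):
    """Native expression evaluation s |- e ⇓ v."""
    if isinstance(e, str):                     # variable
        return s[e]
    if isinstance(e, tuple):                   # (op, e1, e2)
        op, e1, e2 = e
        return OPS[op](ev(e1, s), ev(e2, s))
    return e                                   # value constant


def taint(e, G):
    """Taint of e: lub of the taints of the variables occurring in e."""
    if isinstance(e, str):
        return G[e]
    if isinstance(e, tuple):
        return max(taint(e[1], G), taint(e[2], G))
    return LOW


def run(T, s, G):
    """Big-step execution of an instrumented command on the state (s, G)."""
    tag = T[0]
    if tag == "skip":
        return s, G
    if tag == "assign":                        # ("assign", A, x, e), rule T-Ass
        _, A, x, e = T
        l = max(A, default=LOW)                # lub of the annotation list A
        return {**s, x: ev(e, s)}, {**G, x: max(l, taint(e, G))}
    if tag == "seq":
        s, G = run(T[1], s, G)
        return run(T[2], s, G)
    if tag == "if":                            # ("if", e, T1, T2)
        return run(T[2] if ev(T[1], s) else T[3], s, G)
    if tag == "while":                         # ("while", e, body)
        while ev(T[1], s):
            s, G = run(T[2], s, G)
        return s, G
    raise ValueError(tag)
```

Note that the store component is computed without ever consulting `G` or the annotations, which is the content of the erasure property: dropping the second component of every state yields the native execution.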



3.2 Program Synthesis 

Given a program C, the task of the program synthesis is to generate a program T that is 
suitably annotated for noninterference and is functionally equivalent to C. To this end, 
a synthesis must achieve two tasks: first, assignment annotations must be derived that 
correctly account for all implicit flows. Second, assignments in conditionally executed 
code regions must be counter-balanced so that no information leaks from the execution 
or non-execution of a particular program path. This is achieved by lifting the taints of 
all variables potentially modified in a piece of code to (at least) the taint of any branch 
condition enclosing the code fragment, using compensation code: 

Definition 2. For $\kappa \in \mathcal{L}$, command $C$, and an (arbitrary) enumeration $x_1, \ldots, x_n$ of
$MV(C)$, we define $\mathrm{CompCd}_{\kappa}(C)$ to be $[\kappa]\ x_1 := x_1;\ \ldots;\ [\kappa]\ x_n := x_n$.

Thus, $\mathrm{CompCd}_{\kappa}(C)$ lifts the taints of all $x_i$ to at least $\kappa$ without changing their data
values. Taints already above $\kappa$ are also unaffected. While the term compensation code
appears to have been coined by Chudnow and Naumann [13], the concept itself is
already present in Rifle [29] and Venkatakrishnan et al.'s work [30], and represents an
extensional, non-disruptive alternative to the policy of no sensitive-upgrade [13, 3].
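Compensation code is straightforward to generate once the modified variables are known. The following sketch assumes a tuple encoding of native commands and uses the hypothetical tag `"tassign"` for annotated assignments; the enumeration of $MV(C)$ is fixed to alphabetical order, which is one admissible choice among the arbitrary enumerations the definition allows.

```python
def modified_vars(C):
    """MV(C): variables possibly assigned to in the native command C."""
    tag = C[0]
    if tag == "assign":                # ("assign", x, e)
        return {C[1]}
    if tag == "seq":                   # ("seq", c1, c2)
        return modified_vars(C[1]) | modified_vars(C[2])
    if tag == "if":                    # ("if", e, c1, c2)
        return modified_vars(C[2]) | modified_vars(C[3])
    if tag == "while":                 # ("while", e, body)
        return modified_vars(C[2])
    return set()                       # skip


def comp_cd(kappa, C):
    """CompCd_kappa(C): one annotated self-assignment per modified variable."""
    cmd = ("skip",)
    for x in sorted(modified_vars(C), reverse=True):
        step = ("tassign", [kappa], x, x)      # [kappa] x := x
        cmd = step if cmd == ("skip",) else ("seq", step, cmd)
    return cmd
```

Each self-assignment leaves the value of `x` unchanged and, by rule T-Ass, raises its taint to the lub of `kappa` and its current taint.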

The synthesis of annotated code is now defined on the basis of a derivation
$\mathcal{D} \vdash \{\Gamma\} C \{\Delta\}$. Figure 2 defines the synthesis of instrumented code $\theta(\mathcal{D}, C)$ by case
distinction on $C$, with recursive reference to the subderivations of $\mathcal{D}$.

The annotation $[\Gamma(pc)]$ in assignments models the indirect flows from enclosing
branch conditions into assigned variables, as statically approximated by context $\Gamma$.
Direct flows from variables in $e$ into $x$ are dealt with (in a dynamically precise manner)
by the second side condition of the operational rule T-Ass and thus need not be taken
into consideration during program generation. The clauses for composition, conditionals,
and loops assemble the recursively generated code fragments, adding compensation
code in the latter two cases. In the case for loops, it suffices to add compensation code
as a loop epilogue. We note that the given taint $\kappa$ not only dominates $\Gamma(pc) \sqcup [\![e]\!]_{\Gamma_n}$
but also all $[\![e]\!]_{\Gamma_i}$ ($i < n$), by the monotonicity of typing.

1 This treatment is equivalent to annotating values with their taint directly, i.e. to a model with
stores that map registers to tainted values.

| $C$ | $\theta(\mathcal{D}, C)$ | where . . . |
|---|---|---|
| skip | skip | |
| $x := e$ | $[\Gamma(pc)]\ x := e$ | |
| $C_1; C_2$ | $\theta(\mathcal{D}_1, C_1);\ \theta(\mathcal{D}_2, C_2)$ | $\mathcal{D}_i \vdash \{\Gamma_i\} C_i \{\Gamma_{i+1}\}$, $(\Gamma, \Delta) = (\Gamma_1, \Gamma_3)$ |
| if $e$ then $C_1$ else $C_2$ | if $e$ then $\theta(\mathcal{D}_1, C_1)$ else $\theta(\mathcal{D}_2, C_2)$; $\mathrm{CompCd}_{\kappa}(C_1; C_2)$ | $\kappa = \Gamma(pc) \sqcup [\![e]\!]_{\Gamma}$ |
| while $e$ do $C'$ | while $e$ do $\theta(\mathcal{D}_n, C')$; $\mathrm{CompCd}_{\kappa}(C')$ | $\mathcal{D}_n \vdash \{\Gamma_n\} C' \{\Delta_n\}$, $\kappa = \Gamma(pc) \sqcup [\![e]\!]_{\Gamma_n}$ |

Fig. 2. Synthesis of taint-instrumented code given a derivation $\mathcal{D} \vdash \{\Gamma\} C \{\Delta\}$. Items are named
in accordance with Figure 1: in the cases for composition and conditionals, the derivations $\mathcal{D}_i$
refer to the subderivations for the respective subphrases $C_i$ ($i \in \{1, 2\}$). Similarly in the case for
loops: $n$ is the index where the fixed point iteration stabilizes (cf. Figure 1).

Taint-tracking respects the static analysis in the following sense: 

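Lemma 2. For $\mathcal{D} \vdash \{\Gamma\} C \{\Delta\}$ and $(s, G) \xrightarrow{\theta(\mathcal{D}, C)}_{\mathcal{T}} (t, D)$, $G \sqsubseteq |\Gamma|$ implies $D \sqsubseteq |\Delta|$.

A typical case where the claim in Lemma 2 holds strictly is the (native) program
if m then x:=3 else x:=h, where $\Gamma = [h : high, m : mid, pc : low]$. For $G = |\Gamma|$
we have $Dx = mid \sqsubset high = \Delta x$ whenever $s$ is such that $s \vdash m \Downarrow true$.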

Functional equivalence between $\theta(\mathcal{D}, C)$ and $C$ follows from the above erasure property
and the fact that compensation code does not affect the data plane. We now turn
our attention to the noninterference guarantee of taint tracking. 

3.3 Interpretation and Soundness 

Intuitively, the tag associated with a value in a final state of a taint-enhanced execu- 
tion indicates the initial values it may have been affected by. Thus, the guarantee is 
execution-oriented rather than program-classifying [21]. In contrast to monitor-based
formulations that preemptively terminate executions in case of a potential security vi- 
olation, this intuition treats taints in an end-to-end fashion, but is nontrivial only if at 
least the instrumented execution terminates. In order to clarify its relationship with
noninterference, we give below (Definition 4) a formal reading of our intuition, capturing
the asymmetry by designating the tainted (and terminating) execution as the lead or 
major execution, and considering the second (tainted or untainted) execution the minor 
or competitor execution. Typographically, we distinguish major from minor executions 
by consistently using primed entities for the latter. Part of the intuitive reading then is 
that the terminal tags of lead executions determine which minor executions need to be 
considered, separately for each final value. 

As a preliminary step, let us first consider the following symmetric and static notion.

Definition 3. Program $T$ is statically $(G, H)$-secure if for $(s, G) \xrightarrow{T}_{\mathcal{T}} (t, D)$ and
$(s', G) \xrightarrow{T}_{\mathcal{T}} (t', D')$ and for each $x$, $s =^{G}_{Hx} s'$ implies $tx = t'x$.

By quantifying over $x$ outside of the implication, this notion captures explicitly that
final values in $x$ could only be influenced by certain initial values, namely those held in
variables $y$ with $Gy \sqsubseteq Hx$ (note that the executions agree on their initial taint state $G$
here). Indeed, the static type system ensures static security:

Theorem 2. For $\mathcal{D} \vdash \{\Gamma\} C \{\Delta\}$, $\theta(\mathcal{D}, C)$ is statically $(|\Gamma|, |\Delta|)$-secure.

Thus, indistinguishability of $s$ and $s'$ (w.r.t. $|\Gamma|$) below $|\Delta|(x)$ suffices for guaranteeing
$tx = t'x$. In fact, it is easy to see that static security reformulates universal $\kappa$-security
pointwise for each variable:

Lemma 3. For any $G, H, s, s', t, t'$, the following are equivalent:

1. for all $x$, $s =^{G}_{Hx} s'$ implies $tx = t'x$ (the clause in Definition 3)

2. for all $\kappa$, $s =^{G}_{\kappa} s'$ implies $t =^{H}_{\kappa} t'$ (the clause in Definition 1).

Applying this lemma to the case where $s \xrightarrow{C} t$ and $s' \xrightarrow{C} t'$ for some $C$ (and using
erasure) yields that Theorems 2 and 1 are equivalent: the universal quantification over
$\kappa$ in the latter and the pointwise formulation in Definition 3 are equally powerful.

The effect of the dynamic taints is exploited (and the asymmetry emerges) if we
refine Definition 3 as follows, i.e. instantiate $H$ by the dynamic final taint map $D$:

Definition 4. $T$ is dynamically $G$-secure if for $(s, G) \xrightarrow{T}_{\mathcal{T}} (t, D)$ and
$(s', G) \xrightarrow{T}_{\mathcal{T}} (t', D')$ and for each $x$, $s =^{G}_{Dx} s'$ implies $tx = t'x$.

Now, the final dynamic taints of the major execution, held in $D$, determine whether
$s$ and $s'$ are indistinguishable, instead of the static $|\Delta|$ as before. As $Dx \sqsubseteq |\Delta|(x)$
holds for each $x$, this change relaxes the condition on competitor states $s'$ to $s$ and
hence admits more minor executions for consideration. Indeed, using Lemma 2 and the
monotonicity of indistinguishability one may show that dynamic $G$-security strengthens
(i.e. implies) static $(G, H)$-security, with strict inequality again holding for any variable
$x$ for which the type system performs a strictly approximate lub-operation at some
control flow merge point (cf. the example at the end of Section 3.2).

For $x \in MV(C)$, the final taint $Dx$ necessarily dominates the taint of all those branch
or loop conditions encountered during the lead execution whose body (whether exe- 
cuted or not) contains an assignment to a variable that (directly or indirectly) flows into 
x. Hence, these conditionals necessarily evaluate identically in the minor execution. 

Like static security, dynamic security may also be expressed using universal
quantification over security levels $\kappa$: by Lemma 3, Definition 4 is equivalent to the guarantee
that for $(s, G) \xrightarrow{T}_{\mathcal{T}} (t, D)$ and $(s', G) \xrightarrow{T}_{\mathcal{T}} (t', D')$, $s =^{G}_{\kappa} s'$ implies $t =^{D}_{\kappa} t'$, for
any $\kappa$. In this formulation, the formal asymmetry and the increased precision of dynamic
tracking emerge in the final state indistinguishability relation, again via the use
of $D$ rather than $D'$ or $\Delta$. Similar comments apply to the results in the remainder of
this article: we may always reformulate these results as $\kappa$-security for appropriate
final-state-indistinguishability relations determined by the final taints of the major execution.

Again, dynamic security is satisfied by synthesized programs:

Theorem 3. For $\mathcal{D} \vdash \{\Gamma\} C \{\Delta\}$, $\theta(\mathcal{D}, C)$ is dynamically $|\Gamma|$-secure.

For the proof of Theorem 3 (and similarly for a direct proof of Theorem 2 that avoids
Theorem 1 and Lemma 3), one shows the following generalization, by induction on $\mathcal{D}$.
The proofs for conditionals and loops involve case splits on $x \in MV(C)$, and the proof
for loops proceeds by induction on the operational judgement of the lead execution:

Lemma 4. Suppose $\mathcal{D} \vdash \{\Gamma\} C \{\Delta\}$ and $T = \theta(\mathcal{D}, C)$. Let $(s, G) \xrightarrow{T}_{\mathcal{T}} (t, D)$ and
$(s', G') \xrightarrow{T}_{\mathcal{T}} (t', D')$, where $G \sqsubseteq |\Gamma|$ and $G' \sqsubseteq |\Gamma|$. Then each $x$ with
$\forall y.\ Gy \sqsubseteq Dx \rightarrow (sy = s'y \wedge Gy = G'y)$ satisfies $tx = t'x \wedge Dx = D'x$.

The indistinguishability conditions on the taint components guarantee that no information
leaks via the taints themselves. The formulation of Lemma 4 extends the notions
used by Magazinius et al. and Le Guernic and Jensen [21, 20, 22] to multilevel security,
and avoids a static classification of variables according to their observation level.

A result similar to Lemma 4 may also be obtained for differently instrumented programs
$T = \theta(\mathcal{D}, C)$ and $T' = \theta(\mathcal{D}', C)$ originating from the same native $C$, where
$\mathcal{D} \vdash \{\Gamma\} C \{\Delta\}$, $\mathcal{D}' \vdash \{\Gamma'\} C \{\Delta'\}$, $\Gamma \sqsubseteq \Gamma'$, $G \sqsubseteq |\Gamma|$ and $G' \sqsubseteq |\Gamma'|$.

On the other hand, Theorem 3 and the erasure property yield the following relationship
between taint-instrumented lead executions and native minor executions:

Corollary 1. For $\mathcal{D} \vdash \{\Gamma\} C \{\Delta\}$ and $G = |\Gamma|$ let $(s, G) \xrightarrow{\theta(\mathcal{D}, C)}_{\mathcal{T}} (t, D)$ and
$s' \xrightarrow{C} t'$. Then, each $x$ with $s =^{G}_{Dx} s'$ satisfies $tx = t'x$.

In the following section, we will derive a guarantee similar to Corollary 1 for a synthesis
that also tracks implicit flows dynamically (Theorem 4).

4 Taint Tracking with Control Dependencies 

As discussed above, lowering the pivot $\kappa$ that governs the indistinguishability of initial
states strengthens the guarantee enjoyed by a taint-instrumented program. Rifle's in- 
strumentation pushes this process further, by replacing the assignment-decorating taint 
constants obtained from typing derivations by taint variables, and by complement- 
ing them with additional security registers that can be explicitly manipulated using a 
novel instruction join. The gain in precision arises from inserting join-instructions in 
such a way that the taint variables are upper-bounded by the constants. In effect, in- 
direct flows are statically converted into additional direct flows that are then tracked 
dynamically. 

In this section we treat a join-extended language motivated by Rifle's insight, but 
apply a synthesis that avoids assignment annotations. Treating Rifle's taint variables 
as a subcategory of security registers, we obtain a slightly simpler model that is closer to 
the regime of Venkatakrishnan et al. [30]. In order to emphasize their slightly different
roles, we track data taints operationally separately from control taints, combining them 
only when formulating the soundness results. 

The development is motivated by the following example.

Example 1. For $\{\kappa, \kappa_y, \kappa_v\} \subseteq \mathcal{L}$ and $\Gamma_0(pc) \sqsubseteq \kappa$ we have

$\vdash \{\Gamma_0\}\ \mathsf{if}\ e\ \mathsf{then}\ x := m\ \mathsf{else}\ x := h\ \{\Gamma_1\} \qquad \vdash \{\Gamma_1\}\ \mathsf{if}\ x\ \mathsf{then}\ y := v\ \mathsf{else}\ \mathsf{skip}\ \{\Gamma_2\}$

$\vdash \{\Gamma_0\}\ (\mathsf{if}\ e\ \mathsf{then}\ x := m\ \mathsf{else}\ x := h);\ \mathsf{if}\ x\ \mathsf{then}\ y := v\ \mathsf{else}\ \mathsf{skip}\ \{\Gamma_2\}$

where $\Gamma_0 = [m : mid, h : high, e : \kappa, y : \kappa_y, v : \kappa_v]$, $\Gamma_1 = \Gamma_0[x \mapsto \kappa \sqcup mid \sqcup high]$,
and $\Gamma_2 = \Gamma_1[y \mapsto \kappa \sqcup mid \sqcup high \sqcup \kappa_y \sqcup \kappa_v]$. Consequently, synthesis $\theta$ annotates
the assignment $y := v$ in the second conditional as $[\kappa \sqcup high]\ y := v$, leading to taint level
$\kappa_v \sqcup \kappa \sqcup high$ for $y$ whenever this branch is taken. However, for runs from initial states
$s$ with $[\![e]\!]_s = true$ the taint $\kappa \sqcup mid$ for $x$ would suffice, as any competing initial state
$s'$ indistinguishable from $s$ below $\kappa \sqcup mid$ necessarily follows the same execution path.
Thus, instead of using the static level of the branch condition $x$ when annotating $y := v$,
we'd prefer to use the dynamic one.

In order to replace the use of the static level of branch conditions with suitable dynamic
taints, we introduce a fresh category of security registers $\mathcal{Z}$ (typically ranged over by $z$,
with $\alpha$ ranging over $\mathcal{Z}^*$), and extend the category of commands via

$J \in \mathcal{J} ::= C \mid z := \mathsf{join}\ [e_1, \ldots, e_m]\ [z_1, \ldots, z_n]$.

The language operates over triples $\mu = (s, G, M)$ where $M \in \mathcal{Z} \to \mathcal{L}$ associates lattice
elements to security registers. We define the judgement form $\mu \xrightarrow{J}_{\mathcal{J}} \nu$ by embedding
the rules of $s \xrightarrow{C} t$ for all instruction forms other than assignments, and adding the rules

$$\text{J-Ass}\quad \dfrac{s \vdash e \Downarrow v \qquad [\![e]\!]_G = a}{(s, G, M) \xrightarrow{x := e}_{\mathcal{J}} (s[x \mapsto v],\ G[x \mapsto a],\ M)}$$

$$\text{J-Join}\quad \dfrac{l = \bigsqcup_{i=1}^{n} M(z_i) \sqcup \bigsqcup_{i=1}^{m} [\![e_i]\!]_G \qquad N = M[z \mapsto l]}{(s, G, M) \xrightarrow{z := \mathsf{join}\ [e_1, \ldots, e_m]\ [z_1, \ldots, z_n]}_{\mathcal{J}} (s, G, N)}$$

Assignments leave the control taints unaffected and update the data taints only based
on other data taints; join-instructions combine data and control taints to modify the
latter but leave the former unchanged. Indeed, it is easy to show formal noninterference
results for $\mu \xrightarrow{J}_{\mathcal{J}} \nu$ which express that control taints do not affect data taints and that
neither control nor data taints affect the data plane.
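The two added rules can be sketched as state transformers on triples of store, data taints, and control taints. The register names (`"z0"`, `"z_x"`), the tuple encoding of expressions, and the convention that the caller supplies the value of the assigned expression are assumptions of this illustration.

```python
LOW, MID, HIGH = 0, 1, 2   # chain lattice, lub = max (assumption of this sketch)


def variables(e):
    """Variables of an expression: str = variable, tuple = (op, e1, e2)."""
    if isinstance(e, str):
        return {e}
    if isinstance(e, tuple):
        return variables(e[1]) | variables(e[2])
    return set()                       # value constant


def data_taint(e, G):
    """Data taint of e: lub of the data taints of its variables."""
    return max((G[x] for x in variables(e)), default=LOW)


def j_ass(state, x, e, value):
    """Rule J-Ass; `value` is the result of evaluating e in s, supplied by the caller."""
    s, G, M = state
    return {**s, x: value}, {**G, x: data_taint(e, G)}, M


def j_join(state, z, es, zs):
    """Rule J-Join: z receives the lub of the registers zs and the data taints of es."""
    s, G, M = state
    l = max([M[r] for r in zs] + [data_taint(e, G) for e in es], default=LOW)
    return s, G, {**M, z: l}
```

In this sketch `j_ass` never reads `M` and `j_join` never writes `s` or `G`, mirroring the layering described above: control taints do not affect data taints, and neither affects the data plane.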



4.1 Synthesis 

The synthesis of join-instrumented programs employs two kinds of security registers.
First, for each $x \in \mathcal{X}$, we introduce a security register $z_x$, intended to hold the implicit
flows into $x$, leaving $G$ to track the direct flows. Second, in order to capture the effect of
$pc$, we introduce security registers $z^i$ ($i \in \mathbb{N}$), which will be allocated in a stack-based
fashion according to the nesting of conditionals and loops. For program expression $e$
we write $z(e)$ for the formal expression resulting from substituting each $x$ in $e$ with $z_x$.
Similarly, for $F \in \mathbb{N}^*$, $F = [i_1, \ldots, i_n]$, we write $z(F)$ for the list $[z^{i_1}, \ldots, z^{i_n}]$.

A native command $C$ is translated into the join-instrumented program $\iota(F, C)$ via
the rules in Figure 3, where $F \in \mathbb{N}^*$ and $\mathrm{Join}(i, C)$ is given by

$z_{x_1} := \mathsf{join}\ \epsilon\ [z^i, z_{x_1}];\ \ldots;\ z_{x_n} := \mathsf{join}\ \epsilon\ [z^i, z_{x_n}]$

where $x_1, \ldots, x_n$ is an (arbitrary) enumeration of $MV(C)$.

| $C$ | $\iota(F, C)$ | where . . . |
|---|---|---|
| skip | skip | |
| $x := e$ | $x := e;\ z_x := \mathsf{join}\ \epsilon\ (z(e) @ z(F))$ | |
| $C_1; C_2$ | $\iota(F, C_1);\ \iota(F, C_2)$ | |
| if $e$ then $C_1$ else $C_2$ | $z^i := \mathsf{join}\ [e]\ \alpha$; if $e$ then $\iota(i :: F, C_1);\ \mathrm{Join}(i, C_2)$ else $\iota(i :: F, C_2);\ \mathrm{Join}(i, C_1)$ | $i \notin F$ |
| while $e$ do $C'$ | $z^i := \mathsf{join}\ [e]\ \alpha_1$; while $e$ do ($\iota(i :: F, C');\ \mathrm{Join}(i, C');\ z^i := \mathsf{join}\ [e]\ \alpha_2$); $\mathrm{Join}(i, C')$ | $i \notin F$, $\alpha_1 = z(F) @ z(e)$, $\alpha_2 = z^i :: \alpha_1$ |

Fig. 3. Synthesis of join-instrumented programs $\iota(F, C)$
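The compensation code $\mathrm{Join}(i, C)$ defined above can be generated mechanically from the set of modified variables. This sketch assumes a tuple encoding of native commands, a hypothetical instruction tuple `("join", target, expressions, registers)`, and string names `z^i` and `z_x` for the two kinds of security registers; alphabetical order is used as one admissible enumeration of $MV(C)$.

```python
def modified_vars(C):
    """MV(C): variables possibly assigned to in the native command C."""
    tag = C[0]
    if tag == "assign":                # ("assign", x, e)
        return {C[1]}
    if tag == "seq":                   # ("seq", c1, c2)
        return modified_vars(C[1]) | modified_vars(C[2])
    if tag == "if":                    # ("if", e, c1, c2)
        return modified_vars(C[2]) | modified_vars(C[3])
    if tag == "while":                 # ("while", e, body)
        return modified_vars(C[2])
    return set()                       # skip


def join_code(i, C):
    """Join(i, C): lift z_x to at least z^i for every x modified in C."""
    branch_reg = f"z^{i}"
    return [("join", f"z_{x}", [], [branch_reg, f"z_{x}"])
            for x in sorted(modified_vars(C))]
```

Because each target register also occurs among its own sources, the instruction can only raise `z_x`, never lower it, which is the join-based analogue of the annotated self-assignments of Definition 2.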

Mirroring the effect of the annotations $A$ in the previous section, the translation of
$x := e$ updates $z_x$ with the lub of the security registers associated with the variables in
$e$ and the dynamic taints of the enclosing branches as modeled by $F$. Composite statements
$C_1; C_2$ are translated compositionally. Discarding any additions that may have
been applied to $F$ in $\iota(F, C_1)$, code $\iota(F, C_2)$ may well reuse some register $z^i$ already in
use in $\iota(F, C_1)$ (but not in $F$): the synthesis ensures that the liveness ranges of such $z^i$
do not overlap, hence the variables in effect are distinct. Indeed, the translations of
conditionals and loops initialize newly allocated $z^i$ in their first instruction, by combining
the data taint of the branch condition with its control taint $z(e)$ and the surrounding
control flow taint $z(F)$, refining the use of $\Gamma(pc)$ in Figure 2. Bodies of conditionals and
loops are translated compositionally by pushing the register $i$ onto $F$, and are extended
with compensation code using Join. Optimizing the behavior slightly in comparison
to Figure 2, compensation code in conditionals is only added for variables modified in
the opposite branch (in principle, adding compensation code for $MV(C_2) \setminus MV(C_1)$ in
$C_1$ suffices, and similarly for $C_2$). Additionally, a loop body updates the security register
$z^i$, thus propagating the taints of loop-controlling variables to the next iteration.
By including $z^i$ in $\alpha_2$, we ensure that $z^i$ monotonically increases, i.e. that information
is appropriately propagated to later iterations. Finally, we add compensation code in a
loop epilogue, ensuring that no information leaks from loops that are never entered.

Some simple properties of $\iota(F, C)$ are as follows:

Lemma 5. Let $(s, G, M) \xrightarrow{\iota(F, C)}_{\mathcal{J}} (t, D, N)$. Then (i) $M(z_x) = N(z_x)$ for all $x \notin MV(C)$,
(ii) $M(z^i) = N(z^i)$ for all $i \in F$, and (iii) $N(z^i) \sqsubseteq N(z_x)$ for $x \in MV(C)$
and $i \in F$.

Example 2. Revisiting Example 1, we see that synthesis $\iota(F, C)$ generates code

$z^0 := \mathsf{join}\ [x]\ [z_x]$;

if $x$ then $y := v;\ z_y := \mathsf{join}\ \epsilon\ [z(v), z^0]$ else $z_y := \mathsf{join}\ \epsilon\ [z^0, z_y]$

for the second conditional, where $z^0$ is a fresh taint register. The first instruction sets
$z^0$ to the lub of the (dynamic) data and control taints of the branch condition $x$. In the
positive branch, this taint is then propagated to the control taint of $y$ (together with the
control taint of $v$). The negative branch is equipped with compensation code, lifting $z_y$
to at least $z^0$. The code generated for the first conditional amounts to

$z^0 := \mathsf{join}\ [e]\ (z(e))$;

if $e$ then $x := m;\ z_x := \mathsf{join}\ \epsilon\ [z(m), z^0]$ else $x := h;\ z_x := \mathsf{join}\ \epsilon\ [z(h), z^0]$

where we have silently eliminated the compensation code $z_x := \mathsf{join}\ \epsilon\ [z^0, z_x]$ that is
formally appended to both branches, based on the observation that compensation code
is redundant for variables modified in both branches. In particular, a run starting in
$(s, G, M)$ with $[\![e]\!]_s = true$ only lifts $z_x$ to the data and control taints of $e$ and $m$,
and hence correctly guarantees final-state-indistinguishability w.r.t. variable $y$ for any
execution starting in some state $s'$ that is indistinguishable from $s$ below the resulting
final taint of $y$.

The interpretation of join-instrumented executions combines the taint components for 
direct and indirect flows. We write G A M for the map that sends each program vari- 
able x to Gx u M(z x ). We have proven results similar to those in Section[3j details 
are available in our Coq development [9]. In particular, the following result shows the 
agreement between an instrumented and a native execution, in the style of Corollary Q] 

Theorem 4. Let (s, G, M) ' (fiC) > j (t, D, N) and s' ^ if. For any x, s =^dan)x s ' 
implies tx = t'x. 

The proof proceeds by induction on C, again with case distinctions on x 6 MV(C) in 
the cases for conditionals and loops, and an induction on the (instrumented) operational 
judgement in the case for loops where x e MV(C). 

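Furthermore, the execution of $\iota(F, C)$ respects the static typing:

Lemma 6. For $\vdash \{\Gamma\}C\{\Delta\}$ and $(s, G, M) \xrightarrow{\iota(F, C)} (t, D, N)$ let $\bigsqcup_{i \in F} M(z^i) \sqsubseteq \Gamma(pc)$
and $G \triangle M \sqsubseteq |\Gamma|$. Then $D \triangle N \sqsubseteq |\Delta|$.

In fact, $\iota(F, C)$ is more precise than $\theta(\mathcal{D}, C)$: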

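Theorem 5. For $\mathcal{D} \vdash \{\Gamma\}C\{\Delta\}$, $\bigsqcup_{i \in F} M(z^i) \sqsubseteq \Gamma(pc)$, and $G \triangle M \sqsubseteq |\Gamma|$ let
$(s, G, M) \xrightarrow{\iota(F, C)} (t, D, N)$ and $(s', G \triangle M) \xrightarrow{\theta(\mathcal{D}, C)} (t', D')$. Then each $x$ with
$s =^{G \triangle M}_{(D \triangle N)x} s'$ satisfies $tx = t'x \wedge (D \triangle N)x \sqsubseteq D'x$.

Example 2 is a typical case where $\iota(F, C)$ is strictly more precise than $\theta(\mathcal{D}, C)$.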

5 Path Tracking 

The previous sections focused on termination-insensitive security, a notion that is triv- 
ially satisfied whenever either execution fails to terminate. We now extend the synthesis 
so that termination is instead propagated from lead to minor executions. 

The extension rests on the observation made in Section 3.3 that the execution paths
taken by minor executions are to a large extent determined by the major execution, via
the relationships between the taints of dynamically encountered conditionals and the
final taints of possibly-assigned variables. The exceptions to this rule are cases where
the final value of a variable $x$ in a major execution is independent of all assignments
in a conditional, as in (if m then y:=2 else y:=3); x:=5, or indeed

(if m then y:=2 else while true do skip); x:=5.

In both cases, the final low taint of $x$ does not constrain the value of $m : \mathit{mid}$ in a
competitor initial state $s'$. Hence, competitor executions may follow different program
paths. The same is true for cases where $x \notin MV(C)$.

We modify synthesis $\iota(F, C)$ by adding a further security register, $z_{pc}$, that collects
the taints of all control-flow affecting expressions encountered during a (lead) run, in effect
tracking the (decisions determining the choice of) execution path. Figure 4 presents
the resulting synthesis $\zeta(F, C)$. At each branch point, the taint held in $z_{pc}$ is incremented
by the direct and indirect taints of the control-flow affecting expression.



C                       ζ(F, C)                                          where ...

skip                    skip

x:=e                    x:=e; z_x := join ε (z(e)@z(F))

C1; C2                  ζ(F, C1); ζ(F, C2)

if e then C1 else C2    z_pc := join [e] α_t; z^i := join [e] α;         i ∉ F
                        if e then ζ(i :: F, C1); Join(i, C2)             α = z(F)@z(e)
                        else ζ(i :: F, C2); Join(i, C1)                  α_t = z_pc :: z(e)

while e do C            z_pc := join [e] α_t; z^i := join [e] α_1;       i ∉ F
                        while e do ζ(i :: F, C); Join(i, C);             α_1 = z(F)@z(e)
                            z_pc := join [e] α_t; z^i := join [e] α_2;   α_2 = z^i :: α_1
                        Join(i, C)                                       α_t = z_pc :: z(e)

Fig. 4. Path-tracking synthesis ζ(F, C). Differences to Figure 3 are marked.
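
The synthesis of Fig. 4 is syntax-directed and can be sketched as a recursive program transformation. The following is a hedged illustration rather than the verified definition: commands are encoded as tuples, an expression is represented only by the tuple of its free variables (taints are all that matter here), `Join(i, C)` is assumed to emit one compensation instruction z_x := join ε [z^i, z_x] per variable in MV(C), and the loop body is assumed to include the re-evaluation of both registers. All function names are hypothetical.

```python
from itertools import count

def mv(c):
    """Variables possibly assigned in the (uninstrumented) command c."""
    tag = c[0]
    if tag == "assign":
        return {c[1]}
    if tag == "seq":
        return mv(c[1]) | mv(c[2])
    if tag == "if":
        return mv(c[2]) | mv(c[3])
    if tag == "while":
        return mv(c[2])
    return set()                                   # skip

def seq(*cs):
    """Right-nested sequential composition of one or more commands."""
    out = cs[-1]
    for c in reversed(cs[:-1]):
        out = ("seq", c, out)
    return out

def z_regs(fv):
    return ["z_" + x for x in fv]                  # z(e)

def compensation(i, c):
    """Join(i, C): lift z_x to at least z^i for each x in MV(C) (assumed form)."""
    cs = [("join", "z_" + x, [], [f"z^{i}", "z_" + x]) for x in sorted(mv(c))]
    return seq(*cs) if cs else ("skip",)

def zeta(F, c, fresh):
    """Path-tracking instrumentation of c under enclosing branch indices F."""
    tag = c[0]
    if tag == "skip":
        return c
    if tag == "assign":                            # ("assign", x, fv)
        _, x, fv = c
        return seq(c, ("join", "z_" + x, [], z_regs(fv) + [f"z^{j}" for j in F]))
    if tag == "seq":
        return ("seq", zeta(F, c[1], fresh), zeta(F, c[2], fresh))
    if tag not in ("if", "while"):
        raise ValueError(tag)
    i = next(fresh)                                # fresh index, so i is not in F
    fv = c[1]
    a_t = ["z_pc"] + z_regs(fv)                    # α_t = z_pc :: z(e)
    a_1 = [f"z^{j}" for j in F] + z_regs(fv)       # α (resp. α_1) = z(F)@z(e)
    head = [("join", "z_pc", [fv], a_t), ("join", f"z^{i}", [fv], a_1)]
    if tag == "if":                                # ("if", fv, c1, c2)
        _, _, c1, c2 = c
        return seq(*head, ("if", fv,
                           seq(zeta([i] + F, c1, fresh), compensation(i, c2)),
                           seq(zeta([i] + F, c2, fresh), compensation(i, c1))))
    body = c[2]                                    # ("while", fv, body)
    a_2 = [f"z^{i}"] + a_1                         # α_2 = z^i :: α_1
    loop = ("while", fv, seq(zeta([i] + F, body, fresh), compensation(i, body),
                             ("join", "z_pc", [fv], a_t),
                             ("join", f"z^{i}", [fv], a_2)))
    return seq(*head, loop, compensation(i, body))

prog = ("if", ("m",), ("assign", "y", ()), ("skip",))
out = zeta([], prog, count())
```

For `prog`, the result starts with the two join instructions for z_pc and z^0, and the negative branch ends in the compensation instruction for y.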



The following result combines the termination assurance with a claim similar to
Theorem 4. Note that $s$ and $s'$ are still compared below $(D \triangle N)x$ rather than below
the weaker $N(z_{pc}) \sqcup (D \triangle N)x$. Indeed, $(D \triangle N)x \sqsubset N(z_{pc})$ typically holds
whenever the most secret branch is encountered after the last assignment to $x$.

Theorem 6. Let $(s, G, M) \xrightarrow{\zeta(F, C)} (t, D, N)$ and $s =^{G \triangle M}_{N(z_{pc})} s'$. Then, there is some
$t'$ with $s' \xrightarrow{C} t'$, and we have $tx = t'x$ for any $x$ with $s =^{G \triangle M}_{(D \triangle N)x} s'$.

In the examples above, the lead executions for $s \vdash m \Downarrow \mathit{true}$ yield $N(z_{pc}) = (G \triangle M)m \sqcup M(z_{pc})$.
Thus, minor executions starting in states $s'$ with $s =^{G \triangle M}_{N(z_{pc})} s'$
necessarily satisfy $s'(m) = s(m)$, follow the same execution paths and hence terminate.
Synthesis $\zeta(F, C)$ agrees with $\iota(F, C)$ on all taints other than $z_{pc}$: writing $M \approx M'$
if $Mz = M'z$ for all $z \neq z_{pc}$, we have that for $(s, G, M) \xrightarrow{\zeta(F, C)} (t, D, N)$ and
$(s, G, M') \xrightarrow{\iota(F, C)} (t', D', N')$, $M \approx M'$ implies $t = t'$, $D = D'$ and $N \approx N'$. In
particular, the claim in Theorem 5 remains valid if we replace $\iota(F, C)$ by $\zeta(F, C)$.

[Figure 5: typing rules, among them TS-Skip, TS-Ite and TS-While. Recoverable conclusions:
 ⊢_q {Γ} skip {Γ};  ⊢_q {Γ} x:=e {Δ};  ⊢_q {Γ1} C1; C2 {Γ3};
 ⊢_q {Γ} if e then C1 else C2 {(Δ1 ⊔ Δ2)[pc ↦ p]};  ⊢_q {Γ} while e do C {Γ_n[pc ↦ p]}.]

Fig. 5. System ⊢_q {Γ}C{Δ}. Differences to system ⊢ {Γ}C{Δ} are marked.

C                       δ(C)

skip                    skip

x:=e                    x:=e

C1; C2                  δ(C1); δ(C2)

if e then C1 else C2    z_pc := join [e] [z_pc]; if e then δ(C1) else δ(C2)

while e do C            z_pc := join [e] [z_pc]; while e do (δ(C); z_pc := join [e] [z_pc])

Fig. 6. Synthesis of termination-sensitive data-tracking programs δ(C)

The static counterpart to path tracking is an extension of the type system for judgements
$\vdash \{\Gamma\}C\{\Delta\}$ to a system with judgements $\vdash_q \{\Gamma\}C\{\Delta\}$, where $q \in \mathcal{L}$ represents a (static)
upper bound on the taints of branch conditions. We give the rules of this system in Figure 5
and note that $\vdash_q \{\Gamma\}C\{\Delta\}$ implies $\vdash \{\Gamma\}C\{\Delta\}$ and also $\vdash_p \{\Gamma\}C\{\Delta\}$ for
any $p \sqsupseteq q$. A similar type system is given by Hunt and Sands [17], but not linked to
dynamic taint tracking as we do in Theorem 7 below.

The property guaranteed by the static system is equitermination between executions
starting in initial states that are $\Gamma$-indistinguishable below $q$:

Lemma 7. Let $\vdash_q \{\Gamma\}C\{\Delta\}$ and $s =^{\Gamma}_{q} s'$. Then $\exists t.\ s \xrightarrow{C} t$ iff $\exists t'.\ s' \xrightarrow{C} t'$.

Note that equitermination is symmetric and taint ignorant, i.e. applies to pairs of native
executions (but can be extended to tainted executions by the erasure lemma).
Path tracking refines the static termination analysis, extending Lemma 6:



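Theorem 7. For $\bigsqcup_{i \in F} M(z^i) \sqsubseteq \Gamma(pc)$ let $\vdash_q \{\Gamma\}C\{\Delta\}$ and $(s, G, M) \xrightarrow{\zeta(F, C)} (t, D, N)$.
If $G \triangle M \sqsubseteq |\Gamma|$ and $M(z_{pc}) \sqsubseteq q$ then $D \triangle N \sqsubseteq |\Delta|$ and $N(z_{pc}) \sqsubseteq q$.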



Interestingly, path tracking can also be carried out in the absence of control flow taints,
i.e. for pure data flow tracking. To this end, define the synthesis $\delta(C)$ by erasing from
$\zeta(F, C)$ all join instructions (including those in Join) that define taint registers other
than $z_{pc}$, and modify instructions z_pc := join [e] α to z_pc := join [e] [z_pc] (see details
in Fig. 6). The resulting programs ignore all control taint registers other than $z_{pc}$, but
behave exactly like $\zeta(F, C)$ on data taints and the data plane: for $(s, G, M) \xrightarrow{\delta(C)} (t, D, N)$,
we have $M \approx N$, and $(t, D) = (t', D')$ whenever $(s, G, M') \xrightarrow{\zeta(F, C)} (t', D', N')$.
Writing $\lfloor p \rfloor$ for the control taint component that sends $z_{pc}$ to $p$ and all
other $z$ to $\bot$, we also have that $\delta(C)$ stays below $\zeta(F, C)$:

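Theorem 8. Let $(s, G, \lfloor p \rfloor) \xrightarrow{\delta(C)} (t, D, N)$ and $(s, G, M') \xrightarrow{\zeta(F, C)} (t', D', N')$.
Then, $N = \lfloor q \rfloor$ for some $q \sqsupseteq p$, with $N \sqsubseteq N'$ whenever $p \sqsubseteq M'(z_{pc})$. In particular,
$u =^{D}_{\kappa} u'$ coincides with $u =^{D \triangle N}_{\kappa} u'$ and implies $u =^{D \triangle N'}_{\kappa} u'$, for all $u$, $u'$, and $\kappa$.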

Even for arbitrary $M$, $\delta(C)$ enjoys termination and indistinguishability:

Theorem 9. Let $(s, G, M) \xrightarrow{\delta(C)} (t, D, N)$ and $s =^{G}_{N(z_{pc})} s'$. Then, for any $G'$ and
$M'$, there are $t'$, $D'$, and $N'$ such that $(s', G', M') \xrightarrow{\delta(C)} (t', D', N')$. Furthermore,
$tx = t'x$ holds for any $x$ with $s =^{G}_{Dx} s'$, and $Dx = D'x$ holds additionally whenever
all $y$ with $Gy \sqsubseteq Dx$ satisfy $Gy = G'y$.

Finally, we transfer Theorem 9's conclusion to native executions and express security
as an implication over multilevel indistinguishabilities using Lemma 3:

Corollary 2. Let $(s, G, M) \xrightarrow{\delta(C)} (t, D, N)$ and $s =^{G}_{N(z_{pc})} s'$. Then, $s' \xrightarrow{C} t'$ for
some $t'$, and for all $\kappa$, $s =^{G}_{\kappa} s'$ implies $t =^{D}_{\kappa} t'$.

To our knowledge, this represents the first extensional interpretation of data tracking.
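
The behaviour of δ(C) on the second example program of this section can be observed with a small interpreter. This is a sketch under assumptions and not the paper's formal semantics: the lattice is a three-point chain encoded as integers, expressions are Python callables paired with their free variables, divergence is approximated by a step budget, and the interpreter updates the path taint directly instead of first synthesizing δ(C). The names `run`, `run_delta` and `OutOfFuel` are hypothetical.

```python
LOW, MID, HIGH = 0, 1, 2   # assumed lattice low < mid < high, lub = max

class OutOfFuel(Exception):
    """Raised when the step budget is exhausted (stands in for divergence)."""

def taint(G, fv):
    """Data taint of an expression: lub of the taints of its free variables."""
    return max([LOW] + [G[x] for x in fv])

def run(c, s, G, zpc, fuel):
    """Execute c as delta(C) would; mutates s and G, returns (zpc, fuel)."""
    tag = c[0]
    if tag == "skip":
        return zpc, fuel
    if tag == "assign":                    # ("assign", x, f, fv)
        _, x, f, fv = c
        s[x], G[x] = f(s), taint(G, fv)    # pure data tracking: no control taint
        return zpc, fuel
    if tag == "seq":
        zpc, fuel = run(c[1], s, G, zpc, fuel)
        return run(c[2], s, G, zpc, fuel)
    if tag == "if":                        # ("if", f, fv, c1, c2)
        _, f, fv, c1, c2 = c
        zpc = max(zpc, taint(G, fv))       # z_pc := join [e] [z_pc]
        return run(c1 if f(s) else c2, s, G, zpc, fuel)
    if tag == "while":                     # ("while", f, fv, body)
        _, f, fv, body = c
        zpc = max(zpc, taint(G, fv))
        while f(s):
            if fuel == 0:
                raise OutOfFuel
            fuel -= 1
            zpc, fuel = run(body, s, G, zpc, fuel)
            zpc = max(zpc, taint(G, fv))
        return zpc, fuel
    raise ValueError(tag)

def run_delta(c, s, G, zpc=LOW, fuel=1000):
    """Return the final (state, data taints, z_pc), or None if out of fuel."""
    s, G = dict(s), dict(G)
    try:
        zpc, _ = run(c, s, G, zpc, fuel)
    except OutOfFuel:
        return None
    return s, G, zpc

# (if m then y:=2 else while true do skip); x:=5, with m : mid
diverge = ("while", lambda s: True, (), ("skip",))
prog = ("seq",
        ("if", lambda s: s["m"], ("m",), ("assign", "y", lambda s: 2, ()), diverge),
        ("assign", "x", lambda s: 5, ()))
G0 = {"m": MID, "x": LOW, "y": LOW}
lead = run_delta(prog, {"m": True, "x": 0, "y": 0}, G0)
minor = run_delta(prog, {"m": False, "x": 0, "y": 0}, G0)
```

The lead run ends with z_pc at mid, so any competitor state related to it below that level must agree on m and therefore terminates as well; the run with m false is not such a competitor and exhausts its budget.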



6 Discussion 



We presented an analysis of hybrid information flow by proving selected instrumentation
schemes sound with respect to RIFLE-inspired interpretations of taint tracking.

Building upon Moore & Chong's analysis [24], we envision that our analysis can be
extended to memory operations if side effects are tracked at the level of memory abstractions,
for example by introducing one taint register (and associated compensation
code) per region. Additional future work includes the support of (infinite) computations
with output, a more detailed study of path tracking, and a comparison of taint tracking
with Boudol's formulation of security as a safety property [11,12].

Jee et al. [18] propose taint flow algebras as a generic framework for transferring
traditional compiler optimizations to byte-level taint tracking. Fine-grained tracking 
below the level of words does not appear to have been studied in the language-based 
security community yet, but a unifying treatment of taint- and native optimizations may 
potentially emerge in explicitly relational formulations [8], extending Moore & Chong's 
use of two-level noninterference for selective monitoring. 

Nanevski et al. [25] employ dependent types and relational Hoare Type Theory to
enforce information flow and access control policies, although program verification is 
mostly carried out manually, by interactive verification in Coq. 
Finally, separating control and data taints from each other and from the data plane
appears in principle compatible with multicore execution, if the respective instruction
streams are mapped onto different cores: as communication is orchestrated in an acyclic
fashion, efficient loop pipelining may be enabled [26].